Σ-Optimality for Active Learning on Gaussian Random Fields
Authors
Abstract
A common classifier for unlabeled nodes on undirected graphs uses label propagation from the labeled nodes, equivalent to the harmonic predictor on Gaussian random fields (GRFs). For active learning on GRFs, the commonly used V-optimality criterion queries nodes that reduce the L2 (regression) loss. V-optimality satisfies a submodularity property guaranteeing that greedy selection achieves a (1 − 1/e) approximation of the optimal loss reduction. However, L2 loss may not characterise the true nature of 0/1 loss in classification problems and thus may not be the best choice for active learning. We consider a new criterion we call Σ-optimality, which queries the node that minimizes the sum of the elements in the predictive covariance. Σ-optimality directly optimizes the risk of the surveying problem, which is to determine the proportion of nodes belonging to one class. In this paper we extend submodularity guarantees from V-optimality to Σ-optimality using properties specific to GRFs. We further show that GRFs satisfy the suppressor-free condition in addition to the conditional independence inherited from Markov random fields. We test Σ-optimality on real-world graphs with both synthetic and real data and show that it outperforms V-optimality and other related methods on classification.
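To make the two criteria concrete, the sketch below builds the GRF prior covariance as the inverse of a regularized graph Laplacian and greedily selects query nodes using the rank-one Gaussian conditioning update: V-optimality scores a candidate by the squared norm of its covariance column divided by its variance, while Σ-optimality scores it by the squared column sum divided by its variance. This is a minimal illustration, not the authors' reference code; the example graph, the regularizer delta, and the function names are assumptions.

```python
# Minimal sketch, not the authors' reference implementation: greedy node
# selection on a Gaussian random field (GRF) under V- and Sigma-optimality.
# The example graph, regularizer `delta`, and function names are assumptions.
import numpy as np
import networkx as nx

def grf_prior_covariance(G, delta=1e-2):
    """GRF prior covariance: inverse of the regularized graph Laplacian."""
    L = nx.laplacian_matrix(G).toarray().astype(float)
    return np.linalg.inv(L + delta * np.eye(G.number_of_nodes()))

def greedy_queries(Sigma, budget, criterion="sigma"):
    """Greedily pick query nodes; each query conditions the GRF on that node
    via a rank-one update of the predictive covariance Sigma."""
    Sigma = Sigma.copy()
    n = Sigma.shape[0]
    chosen = []
    for _ in range(budget):
        remaining = [v for v in range(n) if v not in chosen]
        scores = np.full(n, -np.inf)
        col_sum = Sigma.sum(axis=0)
        diag = np.diag(Sigma)
        for v in remaining:
            if criterion == "sigma":
                # Sigma-optimality: reduction of the grand sum 1' Sigma 1
                scores[v] = col_sum[v] ** 2 / diag[v]
            else:
                # V-optimality: reduction of the trace of Sigma
                scores[v] = (Sigma[:, v] ** 2).sum() / diag[v]
        v = int(np.argmax(scores))
        chosen.append(v)
        # Condition on a noiseless observation at node v: rank-one downdate.
        Sigma = Sigma - np.outer(Sigma[:, v], Sigma[v, :]) / Sigma[v, v]
    return chosen

# Usage on a small two-community graph.
G = nx.planted_partition_graph(2, 20, 0.3, 0.02, seed=0)
Sigma = grf_prior_covariance(G)
print("Sigma-optimality queries:", greedy_queries(Sigma, 5, "sigma"))
print("V-optimality queries:    ", greedy_queries(Sigma, 5, "v"))
```

On community-structured graphs like this one, Σ-optimality tends to spread queries across cluster centers rather than outliers, which is the behavior the abstract credits for its classification gains.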
Similar Resources
Data Analysis Project: Σ-Optimality for Active Learning on Gaussian Random Fields
A common classifier for unlabeled nodes on undirected graphs uses label propagation from the labeled nodes, equivalent to the harmonic predictor on Gaussian random fields (GRFs). For active learning on GRFs, the commonly used V-optimality criterion queries nodes that reduce the L2 (regression) loss. V-optimality satisfies a submodularity property guaranteeing that greedy selection achieves a (1 − 1/e)...
Submodularity in Batch Active Learning and Survey Problems on Gaussian Random Fields
Many real-world datasets can be represented in the form of a graph whose edge weights designate similarities between instances. A discrete Gaussian random field (GRF) model is a finite-dimensional Gaussian process (GP) whose prior covariance is the inverse of a graph Laplacian. Minimizing the trace of the prediction covariance Σ (V-optimality) on GRFs has proven successful in batch active learn...
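The batch objective sketched in that summary can be written down directly: conditioning the GRF on a labeled set leaves a predictive covariance equal to the inverse of the unlabeled block of the (regularized) Laplacian, whose trace is the V-optimality objective and whose grand sum is the Σ-optimality (survey) objective. The short example below illustrates this under assumed notation; the graph and function names are not taken from the cited paper.

```python
# Illustrative sketch (assumed notation, not code from the cited paper):
# evaluate the batch V- and Sigma-optimality objectives for a candidate
# labeled set on a GRF whose precision matrix is a regularized Laplacian.
import numpy as np
import networkx as nx

def batch_objectives(G, labeled, delta=1e-2):
    n = G.number_of_nodes()
    L = nx.laplacian_matrix(G).toarray().astype(float) + delta * np.eye(n)
    unlabeled = [v for v in range(n) if v not in set(labeled)]
    # Predictive covariance of the unlabeled nodes given the labeled ones.
    Sigma_u = np.linalg.inv(L[np.ix_(unlabeled, unlabeled)])
    return np.trace(Sigma_u), Sigma_u.sum()   # (V-optimality, Sigma-optimality)

G = nx.karate_club_graph()
v_obj, sigma_obj = batch_objectives(G, labeled=[0, 33])
print(f"trace (V): {v_obj:.3f}   grand sum (Sigma): {sigma_obj:.3f}")
```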
Active Search and Bandits on Graphs using Sigma-Optimality
Many modern information access problems involve highly complex patterns that cannot be handled by traditional keyword-based search. Active Search is an emerging paradigm that helps users quickly find relevant information by efficiently collecting and learning from user feedback. We consider active search on graphs, where the nodes represent the set of instances users want to search over and the...
Subset Selection for Gaussian Markov Random Fields
Given a Gaussian Markov random field, we consider the problem of selecting a subset of variables to observe which minimizes the total expected squared prediction error of the unobserved variables. We first show that finding an exact solution is NP-hard even for a restricted class of Gaussian Markov random fields, called Gaussian free fields, which arise in semi-supervised learning and computer ...
An optimality result about sample path properties of Operator Scaling Gaussian Random Fields
We study the sample path properties of operator scaling Gaussian random fields. Such fields are anisotropic generalizations of self-similar random fields, such as anisotropic fractional Brownian motion. Some characteristic properties of the anisotropy are revealed by the regularity of the sample paths. The sharpest way of measuring smoothness is related to these anisotropies and thus to ...
Journal:
Volume/Issue:
Pages: -
Publication year: 2013